Memory systems including a duplicate removing filter module that is separate from a cache module

ABSTRACT

A memory system includes a cache module configured to store data. A duplicate removing filter module is separate from the cache module. The duplicate removing filter module is configured to receive read requests and write requests for data blocks to be read from or written to the cache module, selectively generate fingerprints for the data blocks associated with the write requests, selectively store at least one of the fingerprints as stored fingerprints and compare a fingerprint of a write request to the stored fingerprints.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 61/667,051, filed on Jul. 2, 2012. The entire disclosure of the application referenced above is incorporated herein by reference.

FIELD

The present disclosure relates to systems using a cache, and more particularly to a duplicate removing filter module for the cache.

BACKGROUND

The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent the work is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.

A host device such as a computer, smart phone or other device includes memory to store data. The memory may include a cache that is used to improve system performance. The cache stores data so that future requests for that data can be handled more quickly. The data that is stored within the cache may include values that were previously used and/or copies of values that are stored elsewhere.

If data requested by the host is stored in the cache, a cache hit occurs and the cache returns the data. Otherwise a cache miss occurs and the data is fetched from its original storage location. Performance improves as the number of cache hits relative to cache misses increases. However, cache is more expensive than standard memory. Therefore, the cache is usually quite a bit smaller than the standard memory. Designers tend to balance cost (which increases with cache size) and performance (which also increases with cache size). As can be appreciated, management of the cache can significantly improve cache performance. Since the cache is relatively small, it is important to remove duplicate data.

SUMMARY

A memory system includes a cache module configured to store data. A duplicate removing filter module is separate from the cache module. The duplicate removing filter module is configured to receive read requests and write requests for data blocks to be read from or written to the cache module, selectively generate fingerprints for the data blocks associated with the write requests, selectively store at least one of the fingerprints as stored fingerprints and compare a fingerprint of a write request to the stored fingerprints.

In other features, the duplicate removing filter module is configured to send one of the data blocks associated with the write request, a corresponding logical block address and the fingerprint of the write request to the cache module when the fingerprint of the write request does not match any of the stored fingerprints.

In other features, the duplicate removing filter module is configured to, when the fingerprint of the data block for the write request matches one of the stored fingerprints, send a logical block address and a cache reference corresponding to a matching one of the stored fingerprints to the cache module.

In other features, the duplicate removing filter module is configured to read one of the data blocks associated with a read request from the cache module when a read hit occurs.

In other features, the duplicate removing filter module is configured to read one of the data blocks associated with a read request from a backend data store when a read miss occurs; send the one of the data blocks from the backend data store to an application of a host device; generate a fingerprint for the one of the data blocks; and send the one of the data blocks, a corresponding logical block address and the fingerprint for the one of the data blocks to the cache module.

In other features, when the cache module evicts one of the data blocks, the cache module identifies whether the one of the data blocks is a duplicate or unique and sends a cache reference and the fingerprint corresponding to the one of the data blocks to the duplicate removing filter module.

Further areas of applicability of the present disclosure will become apparent from the detailed description, the claims and the drawings. The detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the disclosure.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is an example of a functional block diagram of a memory system including a cache module, a duplicate removing filter module, and a backend data store including a standard file system and a file system object according to the present disclosure.

FIG. 2 illustrates an example of a diagram for the memory system of FIG. 1 according to the present disclosure.

FIG. 3 illustrates an example of a diagram for a write operation for the memory system of FIG. 1 according to the present disclosure.

FIG. 4 illustrates an example of a cache metadata manager for a write operation for the memory system of FIG. 1 according to the present disclosure.

FIG. 5 is a flowchart illustrating an example of a method for writing data according to the present disclosure.

FIG. 6 is a flowchart illustrating an example of a method for reading data according to the present disclosure.

FIG. 7 illustrates a workflow for an example of a read hit according to the present disclosure.

FIG. 8 illustrates a workflow for an example of a read miss according to the present disclosure.

FIG. 9 illustrates a workflow for an example of an eviction according to the present disclosure.

In the drawings, reference numbers may be reused to identify similar and/or identical elements.

DESCRIPTION

FIG. 1 shows a memory system 50 including a cache module 64, a duplicate removing filter module 66, and a backend data store 67 including a file system 68 and a file system object 72. In some examples, the file system object 72 is a standard Linux file system. Examples of the cache module 64 in the memory system 50 may include a solid state drive (SSD). The cache module 64 may be a monolithic integrated circuit.

Removal of duplicates is typically performed exclusively by the cache module 64. However, this approach may tend to limit performance since the processing power of the cache module 64 is somewhat limited as compared to processing power available at a host device. According to the present disclosure, the memory system 50 creates an additional layer that is separate from the cache module 64 to perform some of the processing relating to removing duplicates.

FIG. 2 illustrates a workflow for the memory system of FIG. 1. In a write-back operation, an application 80 associated with a host outputs a write request for a data block to the duplicate removing filter module 66. A fingerprint managing module 82 in the duplicate removing filter module 66 calculates a fingerprint for the data block and adds the fingerprint to a table 84. The fingerprint has a smaller size than the data block. In some examples, the fingerprint may be generated using a hash function that uniquely identifies data with a high probability. In some examples, the fingerprint managing module 82 generates the fingerprint using an algorithm such as SHA1 or SHA256, although other algorithms can be used. The duplicate removing filter module 66 sends the data block, the fingerprint and a corresponding logical block address to the cache module 64.
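For illustration only, fingerprint generation of the kind described above can be sketched in Python with the standard hashlib module. The function name fingerprint and the 4096-byte block size are assumptions made for this sketch and are not specified by the disclosure.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the disclosure does not specify one


def fingerprint(data_block: bytes, algorithm: str = "sha256") -> bytes:
    """Return a fixed-size digest that serves as the block's fingerprint."""
    return hashlib.new(algorithm, data_block).digest()


# Two blocks with identical contents and one block that differs in one byte.
block_a = b"\x00" * BLOCK_SIZE
block_b = b"\x00" * BLOCK_SIZE
block_c = b"\x01" + b"\x00" * (BLOCK_SIZE - 1)

fp_a = fingerprint(block_a)
fp_b = fingerprint(block_b)
fp_c = fingerprint(block_c)
fp_a_sha1 = fingerprint(block_a, "sha1")
```

In this sketch the SHA256 fingerprint is 32 bytes and the SHA1 fingerprint is 20 bytes, both far smaller than the data block, and identical blocks yield identical fingerprints.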

In a write-thru operation, the application 80 outputs a write request that is identified as a write-thru operation to the duplicate removing filter module 66. The duplicate removing filter module 66 operates as above. At the same time the data block will be sent directly to the backend data store 67 without duplicate removal.

In a read hit operation, the application 80 outputs a read request to the duplicate removing filter module 66. The duplicate removing filter module 66 checks the cache module 64 to determine whether the requested data is stored in the cache module 64. Duplicate removal will not be required because the duplicate removing filter module 66 points to the data block that is requested and sends the data back to the application 80.

In a read miss operation, the application 80 outputs a read request to the duplicate removing filter module 66. The duplicate removing filter module 66 checks with the cache module 64 to determine whether the requested data block is stored in the cache module 64. If not, the duplicate removing filter module 66 reads the data block from the backend data store 67 and sends the data block back to the application 80. The duplicate removing filter module 66 sends the data (either synchronously or asynchronously) to the cache module 64 to populate the cache module 64 for subsequent read iterations.

FIGS. 3 and 4 illustrate a workflow for a write operation for the memory system of FIG. 1. At 90, the application 80 outputs a first write request for logical block address X to the duplicate removing filter module 66. The fingerprint managing module 82 calculates the fingerprint and checks in the table 84 to determine whether the fingerprint already exists. If not, the duplicate removing filter module 66 sends the logical block address of X to the cache module 64 along with the data block D1 and the corresponding fingerprint. A metadata manager 96 of the cache 64 maintains the fingerprint of X along with X pointing to data block D1. The cache module 64 will also send a cache reference c_(ref) back to the duplicate removing filter module 66.

At 94, the application 80 outputs a second write request corresponding to LBA Y to the duplicate removing filter module 66. The fingerprint managing module 82 calculates the fingerprint of the block and finds that the fingerprint is a duplicate of X. The duplicate removing filter module 66 sends Y and the cache reference c_(ref) to cache module 64 and increments a fingerprint counter of X by 1 in the table 84. The duplicate removing filter module 66 maintains the cache reference of Y pointing to the data block of X. The duplicate removing filter module 66 sends the cache reference to the cache module to let the cache module know that Y is a duplicate of X. In FIG. 4, X and Y both point to data block D1 in cache metadata that is managed by a cache metadata managing module 96.
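As a rough sketch only, the cache metadata of FIG. 4, in which X and Y both point to data block D1, might be modeled as follows. The class CacheMetadata, its method names and the use of integer cache references are hypothetical choices for this example and are not taken from the disclosure.

```python
class CacheMetadata:
    """Hypothetical stand-in for the cache metadata of FIG. 4."""

    def __init__(self):
        self.blocks = {}      # cache reference -> data block
        self.ref_to_fp = {}   # cache reference -> fingerprint
        self.lba_to_ref = {}  # logical block address -> cache reference
        self._next_ref = 0

    def store_unique(self, lba, data, fp):
        """New data: store the block once and return its cache reference."""
        c_ref = self._next_ref
        self._next_ref += 1
        self.blocks[c_ref] = data
        self.ref_to_fp[c_ref] = fp
        self.lba_to_ref[lba] = c_ref
        return c_ref

    def map_duplicate(self, lba, c_ref):
        """Duplicate data: point the LBA at a block that is already stored."""
        if c_ref not in self.blocks:
            raise KeyError(f"unknown cache reference {c_ref}")
        self.lba_to_ref[lba] = c_ref

    def read(self, lba):
        """Resolve an LBA to its data block through the cache reference."""
        return self.blocks[self.lba_to_ref[lba]]


meta = CacheMetadata()
c_ref = meta.store_unique("X", b"D1", b"fp-of-D1")  # first write, unique
meta.map_duplicate("Y", c_ref)                      # second write, duplicate
```

After both writes in this sketch, only one copy of D1 is held, and reads of X and Y resolve to that same block.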

FIG. 5 illustrates an example of a method for writing data according to the present disclosure. At 120, control determines whether a write request with incoming data has been received at the duplicate removing filter module from the application. At 124, control determines whether a fingerprint for the data block exists. If 124 is true, control continues at 128 and increases the corresponding fingerprint count by one. At 132, control sends the logical block address and corresponding cache reference from the table to the cache module. The cache reference corresponds to the matched fingerprint in the table.

If 124 is false, the duplicate removing filter module sends the data block, logical block address and the fingerprint to the cache module. At 144, the cache module sends the cache reference c_(ref) to the duplicate removing filter module.
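For example only, the write method of FIG. 5 can be sketched as follows. The classes FakeCache and DuplicateRemovingFilter and their method names are hypothetical, and the sketch assumes that the counter starts at one for the first writer of a block; the disclosure does not state the initial value. Overwrites of an existing logical block address and evictions are omitted.

```python
import hashlib


class FakeCache:
    """Hypothetical cache stand-in that hands out cache references."""

    def __init__(self):
        self.blocks = {}   # cache reference -> data block
        self.lba_map = {}  # logical block address -> cache reference

    def write_block(self, lba, data, fp):
        c_ref = len(self.blocks)  # simple reference; valid without eviction
        self.blocks[c_ref] = data
        self.lba_map[lba] = c_ref
        return c_ref

    def write_reference(self, lba, c_ref):
        self.lba_map[lba] = c_ref


class DuplicateRemovingFilter:
    def __init__(self, cache):
        self.cache = cache
        self.table = {}  # fingerprint -> [cache reference, count]

    def write(self, lba, data):
        fp = hashlib.sha256(data).digest()
        entry = self.table.get(fp)          # step 124: fingerprint exists?
        if entry is not None:
            entry[1] += 1                   # step 128: increase the count
            self.cache.write_reference(lba, entry[0])  # step 132
            return "duplicate"
        # No match: send data block, LBA and fingerprint to the cache,
        # then record the cache reference it returns (step 144).
        c_ref = self.cache.write_block(lba, data, fp)
        self.table[fp] = [c_ref, 1]         # assumed initial count of one
        return "unique"


cache = FakeCache()
dedup = DuplicateRemovingFilter(cache)
first = dedup.write("X", b"D1")
second = dedup.write("Y", b"D1")
count = dedup.table[hashlib.sha256(b"D1").digest()][1]
```

In this sketch the second write sends no data block to the cache; only the logical block address and the stored cache reference are forwarded.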

FIG. 6 is a flowchart illustrating an example of a method for reading data according to the present disclosure. At 180, control determines whether a read request has been received at the duplicate removing filter module from the application. If 180 is true, control determines whether there is a read hit at 184. If 184 is true, control reads the data block from the cache module at 188 and control ends.

If 184 is false, control reads the data block from the backend data store at 192. At 196, the duplicate removing filter module sends the data block back to the application. The duplicate removing filter module also sends the data block, the fingerprint and the logical block address to the cache module.
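For example only, the read method of FIG. 6 can be sketched as follows. The dictionaries backend_store and cache_blocks and the function read_block are hypothetical stand-ins, and the cache is populated synchronously before the function returns, which is only one of the two options (synchronous or asynchronous) mentioned in the description.

```python
import hashlib

backend_store = {"Z": b"D2"}  # hypothetical backend contents: LBA -> block
cache_blocks = {}             # hypothetical cache: LBA -> (block, fingerprint)


def read_block(lba):
    """Return (data block, 'hit' or 'miss') for a read request."""
    entry = cache_blocks.get(lba)
    if entry is not None:
        return entry[0], "hit"           # step 188: read from the cache
    data = backend_store[lba]            # step 192: read from the backend
    fp = hashlib.sha256(data).digest()   # fingerprint for the fetched block
    cache_blocks[lba] = (data, fp)       # populate cache for later reads
    return data, "miss"                  # step 196: data goes to the caller


first_read = read_block("Z")
second_read = read_block("Z")
```

The first read of LBA Z misses and populates the cache, so the second read of the same address is served as a hit.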

FIG. 7 illustrates a workflow for an example of a read hit according to the present disclosure. At 204, the application 80 outputs a read request for LBA Y to the duplicate removing filter module 66. The duplicate removing filter module 66 sends the request for LBA Y to the cache module 64. The cache metadata points to data block D1. The cache module 64 returns the LBA Y and the data block D1 to the duplicate removing filter module 66. The duplicate removing filter module 66 sends the data block D1 to the application 80.

FIG. 8 illustrates a workflow for an example of a read miss according to the present disclosure. The application 80 sends a read request for LBA Z to the duplicate removing filter module 66. The duplicate removing filter module 66 sends the request for the LBA Z and the fingerprint to the cache module 64 and/or generates the fingerprint and checks to see if it is present in the table. In this example, the LBA Z is not present in the cache module. The duplicate removing filter module 66 sends a request for the LBA Z to the backend data store. The backend data store sends the data block D2 corresponding to the LBA Z to the duplicate removing filter module 66. The duplicate removing filter module 66 calculates the FP and sends the FP, the data block D2 and the LBA Z to the cache module. The cache metadata managing module 96 adds metadata to an existing database stored therein. The LBA Z points to data block D2 in the cache module 64. The cache module 64 sends the cache reference for the LBA Z to the duplicate removing filter module 66 for subsequent read requests.

FIG. 9 illustrates a workflow for an example of an eviction according to the present disclosure. The table 84 in the duplicate removing filter module 66 maintains a counter storing the number of duplicate fingerprints pointing to a single data block. When the cache module 64 evicts one of the data blocks, the cache module 64 sends the cache reference c_(ref) and the fingerprint to the duplicate removing filter module 66 to decrement the counter maintained by the duplicate removing filter module 66. For duplicate block eviction, the cache module instructs the duplicate removing filter module 66 to reduce the counter. For unique block eviction, the counter will be reduced along with the data block eviction. Unique blocks are blocks that do not have duplicates.
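As one possible reading of the eviction workflow, the counter handling in the table can be sketched as follows. The class FingerprintTable and its methods are hypothetical, and the sketch assumes that the counter reflects the number of logical block addresses that currently point to a data block, so that the block can be removed when the counter reaches zero.

```python
class FingerprintTable:
    """Hypothetical fingerprint table with a per-fingerprint counter."""

    def __init__(self):
        self.entries = {}  # fingerprint -> {"c_ref": ..., "count": ...}

    def add(self, fp, c_ref):
        """Record one more logical block address pointing at this block."""
        entry = self.entries.setdefault(fp, {"c_ref": c_ref, "count": 0})
        entry["count"] += 1

    def on_evict(self, c_ref, fp):
        """Handle an eviction notice carrying a cache reference and fingerprint.

        Returns True when no references remain, meaning the block was unique
        at eviction time and the data block itself can be removed.
        """
        entry = self.entries[fp]
        if entry["c_ref"] != c_ref:
            raise ValueError("cache reference does not match fingerprint")
        entry["count"] -= 1
        if entry["count"] == 0:
            del self.entries[fp]
            return True
        return False


table = FingerprintTable()
table.add(b"fp-D1", 7)   # X written
table.add(b"fp-D1", 7)   # Y written as a duplicate of X
first_evict = table.on_evict(7, b"fp-D1")   # duplicate eviction
second_evict = table.on_evict(7, b"fp-D1")  # unique eviction
```

In this sketch the first eviction only reduces the counter, while the second eviction reduces the counter to zero and removes the table entry along with the data block.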

Advantages include increasing the read hit ratio for the cache. The penalty incurred due to read-miss cycles in such designs will be reduced. However, there will be a latency increase due to increased cycles at the duplicate removing filter module.

The memory system according to the present disclosure separates the duplicate removing filter module 66 from the cache module 64 across a defined and complete set of interfaces. For example only, in one configuration the duplicate removing filter module 66 can be located on or associated with the host device while the cache module 64 can reside on a Peripheral Component Interconnect Express (PCIe) card. Alternately, the duplicate removing filter module 66 and the cache module 64 can be associated with the host or the PCIe card.

Each of these configurations has its own benefits with respect to CPU utilization, Plug-n-Play properties, performance limits, etc. The separation also allows the duplicate removing filter module 66 and the cache module 64 to be developed by two different parties.

The foregoing description is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses. The broad teachings of the disclosure can be implemented in a variety of forms. Therefore, while this disclosure includes particular examples, the true scope of the disclosure should not be so limited since other modifications will become apparent upon a study of the drawings, the specification, and the following claims. As used herein, the phrase at least one of A, B, and C should be construed to mean a logical (A or B or C), using a non-exclusive logical OR. It should be understood that one or more steps within a method may be executed in different order (or concurrently) without altering the principles of the present disclosure.

In this application, including the definitions below, the term module may be replaced with the term circuit. The term module may refer to, be part of, or include an Application Specific Integrated Circuit (ASIC); a digital, analog, or mixed analog/digital discrete circuit; a digital, analog, or mixed analog/digital integrated circuit; a combinational logic circuit; a field programmable gate array (FPGA); a processor (shared, dedicated, or group) that executes code; memory (shared, dedicated, or group) that stores code executed by a processor; other suitable hardware components that provide the described functionality; or a combination of some or all of the above, such as in a system-on-chip.

The term code, as used above, may include software, firmware, and/or microcode, and may refer to programs, routines, functions, classes, and/or objects. The term shared processor encompasses a single processor that executes some or all code from multiple modules. The term group processor encompasses a processor that, in combination with additional processors, executes some or all code from one or more modules. The term shared memory encompasses a single memory that stores some or all code from multiple modules. The term group memory encompasses a memory that, in combination with additional memories, stores some or all code from one or more modules. The term memory may be a subset of the term computer-readable medium. The term computer-readable medium does not encompass transitory electrical and electromagnetic signals propagating through a medium, and may therefore be considered tangible and non-transitory. Non-limiting examples of a non-transitory tangible computer readable medium include nonvolatile memory, volatile memory, magnetic storage, and optical storage.

The apparatuses and methods described in this application may be partially or fully implemented by one or more computer programs executed by one or more processors. The computer programs include processor-executable instructions that are stored on at least one non-transitory tangible computer readable medium. The computer programs may also include and/or rely on stored data.

What is claimed is:
 1. A memory system comprising: a cache memory configured to store data; a backend data store; a duplicate removing filter separate from the cache memory and configured to: receive read requests and write requests for data blocks to be read from or written to the cache memory; generate fingerprints for the data blocks associated with the write requests and store at least one of the fingerprints in the duplicate removing filter; determine a match between a fingerprint of a data block of a write request and the at least one stored fingerprints; send a logical block address of the write request and a cache reference corresponding to the matched fingerprint to the cache memory in response to a single condition indicating that the fingerprint of the data block matches the at least one stored fingerprints; in response to receiving a write request as a write-back operation, send the logical block address of the write request, the fingerprint of the write request and the data block to the cache memory; and in response to identifying a write request as a write-thru operation, the duplicate removing filter is configured to send the data block to the backend data store without performing duplicate removal.
 2. The memory system of claim 1, wherein the cache memory is implemented as an integrated circuit.
 3. The memory system of claim 1, wherein the cache memory comprises a solid state drive.
 4. The memory system of claim 1, wherein the duplicate removing filter is configured to send one of the data blocks associated with the write request, the logical block address of the write request and the fingerprint of the write request to the cache memory when the fingerprint of the write request does not match any of the stored fingerprints.
 5. The memory system of claim 4, wherein the cache memory is configured to send a cache reference to the duplicate removing filter in response to the duplicate removing filter sending the data block, the logical block address and the fingerprint to the cache memory.
 6. The memory system of claim 4, wherein the cache memory is configured to maintain metadata including the fingerprint and the logical block address of the one of the data blocks associated with the write request.
 7. The memory system of claim 1, wherein the duplicate removing filter increments a counter associated with the one of the stored fingerprints when the fingerprint of the write request matches one of the stored fingerprints.
 8. The memory system of claim 1, wherein the duplicate removing filter is configured to read one of the data blocks associated with a read request from the cache memory when a read hit occurs.
 9. The memory system of claim 1, wherein the duplicate removing filter is configured to: read one of the data blocks associated with a read request from a backend data store when a read miss occurs; send the one of the data blocks associated with the read request from the backend data store to an application of a host device; generate a fingerprint for the one of the data blocks associated with the read request; and send the one of the data blocks associated with the read request, a logical block address associated with the read request and the fingerprint for the one of the data blocks associated with the read request to the cache memory.
 10. The memory system of claim 9, wherein the cache memory is configured to send a cache reference to the duplicate removing filter in response to the duplicate removing filter sending the one of the data blocks associated with the read request, the logical block address associated with the read request and the fingerprint for the one of the data blocks associated with the read request to the cache memory.
 11. The memory system of claim 1, wherein the fingerprints uniquely identify the data blocks and when the cache memory evicts one of the data blocks, the cache memory identifies whether the one of the data blocks is a duplicate or unique and sends a cache reference and the fingerprint corresponding to the one of the data blocks to the duplicate removing filter.
 12. The memory system of claim 11, wherein when the one of the data blocks to be evicted is a duplicate, the cache memory instructs the duplicate removing filter to reduce a counter associated with the one of the data blocks.
 13. The memory system of claim 11, wherein when the one of the data blocks to be evicted is unique the cache memory is configured to: instruct the duplicate removing filter to reduce a counter associated with the one of the data blocks; and remove the one of the data blocks from the cache memory.
 14. The memory system of claim 1, wherein, in response to identifying the write request as the write-thru operation, the duplicate removing filter is further configured to send the logical block address of the write request, the fingerprint of the write request and the data block to the cache memory without performing duplicate removal.
 15. The memory system of claim 1, further comprising: an input connection configured to receive the write request as either the write-back or write-thru operation from an application on a host device.
 16. A method for operating a memory system comprising: separating a duplicate removing filter from a cache; and in the duplicate removing filter: receiving read requests and write requests for data blocks to be read from or written to the cache memory; generating fingerprints for the data blocks associated with the write requests and storing at least one of the fingerprints in the duplicate removing filter; and determine a match between a fingerprint of a write request and the at least one of the stored fingerprints; send a logical block address of the write request and a cache reference corresponding to the matched fingerprint to the cache in response to a single condition indicating that the fingerprint of the data block matches the at least one stored fingerprints; in response to receiving a write request as a write back operation, send the logical block address of the write request, the fingerprint of the write request and the data block to the cache memory; and in response to identifying the at least one request is for a write-thru operation, the duplicate removing filter sends the first data block to a backend data store without performing duplicate removal.
 17. The method of claim 16, further comprising sending one of the data blocks associated with the write request, the logical block address of the write request and the fingerprint of the write request to the cache when the fingerprint of the write request does not match any of the stored fingerprints.
 18. The method of claim 17, further comprising sending a cache reference to the duplicate removing filter in response to the duplicate removing filter sending the data block, the logical block address and the fingerprint to the cache.
 19. The method of claim 17, further comprising maintaining metadata including the fingerprint and the logical block address of the one of the data blocks associated with the write request.
 20. The method of claim 16, further comprising: reading one of the data blocks associated with a read request from a backend data store when a read miss occurs; sending the one of the data blocks associated with the read request from the backend data store to an application of a host device; generating a fingerprint for the one of the data blocks associated with the read request; and sending the one of the data blocks associated with the read request, a logical block associated with the read request address and the fingerprint for the one of the data blocks associated with the read request to the cache.
 21. The method of claim 16, further comprising: receiving the write request as either the write-back or write-through operation from an application on a host device.